Enrich: software for analysis of protein function by enrichment and depletion of variants

Authors

  • Douglas M. Fowler
  • Carlos L. Araya
  • Wayne Gerard
  • Stanley Fields
Abstract

SUMMARY: Measuring the consequences of mutation in proteins is critical to understanding their function. These measurements are essential in such applications as protein engineering, drug development, protein design and genome sequence analysis. Recently, high-throughput sequencing has been coupled to assays of protein activity, enabling the analysis of large numbers of mutations in parallel. We present Enrich, a tool for analyzing such deep mutational scanning data. Enrich identifies all unique variants (mutants) of a protein in high-throughput sequencing datasets and can correct for sequencing errors using overlapping paired-end reads. Enrich uses the frequency of each variant before and after selection to calculate an enrichment ratio, which is used to estimate fitness. Enrich provides an interactive interface to guide users. It generates user-accessible output for downstream analyses as well as several visualizations of the effects of mutation on function, thereby allowing the user to rapidly quantify and comprehend sequence-function relationships.

AVAILABILITY AND IMPLEMENTATION: Enrich is implemented in Python and is available under a FreeBSD license at http://depts.washington.edu/sfields/software/enrich/. Enrich includes detailed documentation as well as a small example dataset.

CONTACT: [email protected]; [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data is available at Bioinformatics online.
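The enrichment ratio described in the summary can be illustrated with a minimal sketch. The abstract does not give Enrich's exact formula, so this assumes the simplest reading: each variant's frequency in the selected library divided by its frequency in the input library, with a log2 transform for convenience. The function names, the variant labels and the read counts below are hypothetical and are not taken from Enrich itself.

```python
import math


def variant_frequencies(counts):
    """Convert raw read counts per variant into frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {variant: n / total for variant, n in counts.items()}


def enrichment_ratios(input_counts, selected_counts):
    """Ratio of post-selection to pre-selection frequency per variant.

    Assumption: variants missing from either library are skipped rather
    than given a pseudocount; a real analysis may handle them differently.
    """
    f_in = variant_frequencies(input_counts)
    f_sel = variant_frequencies(selected_counts)
    shared = f_in.keys() & f_sel.keys()
    return {v: f_sel[v] / f_in[v] for v in shared if f_in[v] > 0}


def log2_ratios(ratios):
    """Log2-transform ratios so enrichment and depletion are symmetric."""
    return {v: math.log2(r) for v, r in ratios.items() if r > 0}


# Hypothetical read counts before and after selection.
input_counts = {"WT": 500, "A12V": 250, "G45D": 250}
selected_counts = {"WT": 500, "A12V": 400, "G45D": 100}

ratios = enrichment_ratios(input_counts, selected_counts)
log_ratios = log2_ratios(ratios)

for variant in sorted(ratios):
    print(f"{variant}\tratio={ratios[variant]:.2f}\tlog2={log_ratios[variant]:+.2f}")
```

In this toy dataset a ratio above 1 marks a variant that became more common after selection and a ratio below 1 marks a depleted one. Skipping variants seen in only one library keeps the sketch short; it discards information that a pseudocount-based treatment would retain.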

Similar resources

Comprehensive Computational Analysis of Protein Phenotype Changes Due to Plausible Deleterious Variants of Human SPTLC1 Gene

Genetic variations found in the coding and non-coding regions of a gene are known to influence the structure as well as the function of proteins. Serine palmitoyltransferase long chain subunit 1 a member of α-oxoamine synthase family is encoded by SPTLC1 gene which is a subunit of enzyme serine palmitoyltransferase (SPT). Mutations in SPTLC1 have been associated with hereditary sensory and auto...

Molecular Identification of Pre-Existing Immunity in Human against H9N2 Influenza Viruses Using HLA-A*0201 Binding Peptides

Background and Aims: To assess the contribution of the genetic and antigenic diversity of H9N2 influenza viruses to evading immune responses, cytotoxic T lymphocyte (CTL) epitopes in the hemagglutinin (HA) protein restricted by HLA binding peptides were identified. Materials and Methods: Phylogenetic analyses were carried out for all of full length HA and deduced amino acid sequences of H9N2 viruses available ...

Biotechnology Workshop, a Model for Curriculum Enrichment: Investigating Medical Students' Viewpoints

Introduction: In recent years, one of the ways of enriching medical students' curriculum and training them as physician-scientists has been acquainting them with other sciences such as biotechnology. In this study, conducted as a workshop to acquaint students with biotechnology, medical students' viewpoints on its advantages were investigated. Methods: A seven day biotechnology works...

Implementation and Optimization of Annotation and Interpretation Step of Next-Generation Sequencing Data for Non-Syndromic Autosomal Recessive Hearing Loss

Introduction: The precision and time required for analysis of data in next-generation sequencing (NGS) depend on many factors, including the tools utilized for alignment, variant calling, annotation and filtering of variants, personnel expertise in data analysis and interpretation, and the computational capacity of the lab, and optimizing the analysis is a challenging task. Method: An application software...

Journal title:

Volume 27, Issue

Pages -

Publication date: 2011